feat(traces): add reranking span kind for document reranking in llama index #1588
Conversation
Python code LGTM. Any reason to keep the test you added as an integration test, or can it be made into a unit test?
Let's get a review from @mikeldking for the frontend.
Looks awesome @RogerHYang !
we need a followup for langchain too
LGTM! I filed some follow-up tickets. Might do a quick audit on the UI post any changes but looks good!
app/src/pages/trace/TracePage.tsx (outdated)
<CodeBlock value={query} mimeType="text" />
</Card>
<Card
  title={`Input Documents (${numInputDocuments})`}
There is a `titleExtra` prop on Card where you can place a `Counter` component: https://5f9739a76e154c00220dd4b9-zeknbennzf.chromatic.com/?path=/story/counter--gallery
app/src/pages/trace/TracePage.tsx (outdated)
title={`Re-ranked Documents (${numOutputDocuments})`}
{...defaultCardProps}
Same as above - rely on `titleExtra` and `Counter`.
app/src/pages/trace/TracePage.tsx (outdated)
{input_documents.map((document, idx) => {
  return (
    <li key={idx}>
      <DocumentItem document={document} />
I might re-color these just so that there's a visual hierarchy of color (e.g. so that re-ranked documents take on a different tint); this way, as you are clicking around, you can clearly see the difference.
Re-using the `DocumentItem` component is good, but I think just showing the new `score` label might be a tad confusing? Or maybe using `score` as an abstraction is intended here.
In one case it's often spatial distance, whereas when running through a reranker it is a relevance rank. Just thinking that from a user's perspective, by displaying `score: XX` alongside both, we lose a bit of an opportunity to explain what `score` means in this context, `score` being pretty generic.
I think even though `score` is generic, it is still accurate. On the input side of the reranker, `score` may or may not exist, and even if it does exist, it's not considered by the reranker. But if the "input" `score` does exist, it was generated by a preprocessor for a separate purpose. The general mental picture here is that there could be millions of documents in a corpus, and only a relatively small set are chosen to be reranked, and that selection process can have a `score` of its own based on the query in question. Even though that `score` is not meaningful to the reranker, it is still an informative attribute of the input document, because it relays how the document became a candidate in the first place (especially when the preprocessor is missing from the trace). On the other hand, we can't really get more specific than the `score` verbiage because we don't have more information. On balance, although it may seem confusing at first, a user should have enough context to reason their way through it.
I think I wasn't disputing the way we capture the score; I was just thinking of ways to reduce the mental "reason their way through it" a bit. But I don't have an immediately good prefix for the reranker score, so let's keep it for now.
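To make the distinction in this thread concrete, here is a small hypothetical sketch (plain Python, not the actual llama_index or Phoenix data model; the document shapes and the toy relevance function are illustrative assumptions): input documents may carry a retriever-assigned `score` that the reranker ignores, while output documents get a fresh relevance `score` from the reranker, under the same generic key name.

```python
# Input documents: `score` (if present) came from an upstream retriever,
# e.g. a vector-distance-based preprocessor. The reranker ignores it.
input_documents = [
    {"content": "doc about rerankers", "score": 0.12},
    {"content": "doc about retrieval", "score": 0.34},
    {"content": "doc about llamas"},  # an input `score` may be absent entirely
]

def toy_relevance(query: str, content: str) -> float:
    # Stand-in for a real reranking model: fraction of query words present.
    words = query.lower().split()
    return sum(w in content.lower() for w in words) / len(words)

query = "retrieval"
# Output documents: `score` now means reranker relevance, a different
# quantity from the input score even though the key name is the same.
output_documents = sorted(
    (
        {"content": d["content"], "score": toy_relevance(query, d["content"])}
        for d in input_documents
    ),
    key=lambda d: d["score"],
    reverse=True,
)
```

The key point the sketch captures is that the two `score` values are produced by different components for different purposes, even though they share one label.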
RERANKING_INPUT_DOCUMENTS,
RERANKING_MODEL_NAME,
RERANKING_OUTPUT_DOCUMENTS,
RERANKING_TOP_K,
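A hedged sketch of how the four attributes named in the diff might be populated for one reranking span. Only the constant names come from the diff; the dict shape, key spellings, and all values below are illustrative assumptions, not the PR's actual semantic conventions.

```python
# Hypothetical reranking span payload. The relationships it illustrates:
# input documents are the candidates fed in, output documents are the
# (at most) top_k survivors, each carrying a reranker-assigned score.
reranking_span_attributes = {
    "RERANKING_MODEL_NAME": "my-reranker",  # hypothetical model name
    "RERANKING_TOP_K": 3,                   # max docs kept in the output
    "RERANKING_INPUT_DOCUMENTS": [          # candidates fed to the reranker
        {"content": "doc A"},
        {"content": "doc B"},
        {"content": "doc C"},
        {"content": "doc D"},
    ],
    "RERANKING_OUTPUT_DOCUMENTS": [         # top_k docs, freshly rescored
        {"content": "doc B", "score": 0.9},
        {"content": "doc D", "score": 0.7},
        {"content": "doc A", "score": 0.4},
    ],
}
```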
Quick question on this parameter: https://docs.cohere.com/docs/reranking
I'm guessing this is the same as TOP_N? If you feed, say, 5 documents but pass a top_k of 3, does it only rank 3? Just trying to understand why this is a parameter to the rerank model.
Yes, this is the same as TOP_N. (The letter is `K` in the literature because `N` is usually the total number of docs.) The caller of the reranker usually just wants a relatively small number of docs out of an initial set of tens or hundreds. It's certainly optional, because the reranker can simply rank every document, but in general a reduction in number is expected at each stage of the retrieval process.
Still confused though: if I pass, say, 5 documents with a `top_k` of 3, does it rank 5 and trim the last two?
Yes, it retains up to `K` documents in the output, so in the case of top 3, the two docs with the lowest scores have to be dropped. Top `K` is applied after ranking all 5 docs.
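The top-k semantics described above can be sketched in a few lines of plain Python (a toy stand-in, not any real reranker's implementation; `overlap_score` is an invented relevance function for illustration): every candidate is scored, results are sorted, and only then is the list truncated to `K`.

```python
def rerank(query: str, documents: list[str], top_k: int) -> list[tuple[int, str]]:
    """Rank ALL documents by score, then keep at most `top_k` of them."""
    scored = sorted(
        ((overlap_score(query, doc), doc) for doc in documents),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return scored[:top_k]  # truncation happens only after full ranking

def overlap_score(query: str, doc: str) -> int:
    # Toy relevance: number of query words that also appear in the document.
    return len(set(query.split()) & set(doc.split()))

docs = ["alpha beta gamma", "alpha beta", "alpha", "delta", "epsilon"]
# All 5 docs get scored; the two lowest-scoring ones are dropped.
top3 = rerank("alpha beta gamma", docs, top_k=3)
```

This matches the exchange above: with 5 inputs and `top_k` of 3, all 5 are ranked and the last two are trimmed.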
@@ -76,11 +75,7 @@
"if not (openai_api_key := os.getenv(\"OPENAI_API_KEY\")):\n",
"    openai_api_key = getpass(\"🔑 Enter your OpenAI API key: \")\n",
"openai.api_key = openai_api_key\n",
"os.environ[\"OPENAI_API_KEY\"] = openai_api_key\n",
"\n",
"if not (cohere_api_key := os.getenv(\"COHERE_API_KEY\")):\n",
This is the internal one - it seemed nice to have a notebook that executes a reranker?
Oh, my bad. I had mistaken this one for the public notebook llama_index_tracing_tutorial.ipynb, which has a very similar name.
reverted
resolves #1153
Screen.Recording.2023-10-06.at.5.00.57.PM.mov